Open set text recognition technology


Abstract

In pattern recognition and text recognition applications in open environments, new data, new patterns, and new categories keep emerging, so algorithms must be able to cope with novel categories. To address this problem, researchers have begun to focus on the open-set text recognition (OSTR) task. The task requires that, at test (inference) time, an algorithm not only recognize the character categories seen in the training set but also recognize, reject, or discover novel characters that the training set does not contain. OSTR is gradually becoming one of the research hotspots in text recognition. This paper first briefly summarizes open-set pattern recognition techniques, and then focuses on the research background, task definition, basic concepts, research emphases, and technical difficulties of OSTR. For the three major problems of OSTR (unknown sample discovery, novel category recognition, and contextual information bias), related work is reviewed in terms of model structure, characteristics and strengths, and application scenarios. Finally, the development trends and research directions of OSTR are analyzed and discussed.

Text recognition transcribes the text in images and serves domains such as document digitization, content moderation, scene text translation, autonomous driving, and scene understanding. Conventional text recognition techniques are mostly concerned with characters seen during training. However, two factors are not yet well covered by the training sets of these methods: novel character categories and out-of-vocabulary (OOV) samples. Novel-character samples contain characters absent from the training set, whereas OOV samples contain seen characters in unseen combinations or contexts. Novel character categories can arise as 1) unseen ligatures, emoticons, and characters from unseen languages in internet-based environments, 2) characters from foreign or region-specific languages in scene-text environments, and 3) undiscovered characters in document digitization. In addition, because languages and text formats are heterogeneous and unevenly balanced, linguistic statistics (e.g., n-grams and context) gradually become biased with respect to the training data, which challenges methods that correlate strongly with the vocabulary.

These requirements yield three key scientific problems that affect the cost and efficiency of open-world applications. The first is a novel-character spotting capability, closely related to the popular open-set recognition problem: unseen characters should be rejected rather than silently replaced by seen characters. The second is novel-character recognition: novel characters emerge in many cases and re-training the model upon each occurrence is costly, so an incremental recognition capability is needed, which amounts to a generalized zero-shot text recognition task. The third is robustness to the linguistic bias brought by OOV samples. Owing to the character-based nature of their prediction, many methods can handle OOV samples to some extent, but this capability is constrained and the methods still demonstrate strong vocabulary reliance. Existing tasks such as zero-shot text recognition model individual aspects only, which motivates the open-set text recognition (OSTR) task. This task aims to spot and recognize novel characters while staying robust to linguistic skews. As an extension of the conventional text recognition task, OSTR should also retain decent performance on seen contents.

In recent years, OSTR has been developing intensively within text recognition, and this literature review covers the task and its related domains. It consists of five parts: background, genericity, concept, implementation, and summary. First, we introduce the application background and analyze specific OSTR cases. The generic open-set recognition task is then introduced in brief as a preliminary for researchers less familiar with the field. Next, the definition of OSTR is introduced, followed by a discussion of its relationship with related tasks, e.g., close-set text recognition and zero-shot text recognition. Implementation-wise, common frameworks are introduced first, followed by derivations of these frameworks grouped by the problem they address: novel category spotting, novel category recognition, and linguistic bias robustness. Specifically, novel category spotting refers to rejecting characters that are absent from a given label set. Novel category recognition refers to recognizing new categories without re-training, using side information of the corresponding categories; under this definition, generative adversarial network (GAN)-based and transductive approaches are excluded. The evaluation part covers publicly available datasets, commonly used metrics, several protocols, and the performance of typical methods. Here, a protocol comprises the composition of the training set, the testing set, and the metrics. In the summary, a comparative analysis of the growth and technical preferences of the field is presented. Finally, potential trends and future research directions are discussed.
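As a rough illustration of the behaviour the abstract asks for (rejecting unseen characters, and recognizing a new character once side information for it is available, without re-training), the following is a minimal sketch and not any of the surveyed methods. It assumes characters are already represented as feature vectors and that each known class has a prototype vector; the class name `OpenSetCharClassifier`, the `"<unknown>"` label, the threshold of 0.8, and the toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class OpenSetCharClassifier:
    """Hypothetical prototype matcher with a rejection threshold."""

    def __init__(self, prototypes, threshold=0.8):
        self.prototypes = dict(prototypes)  # label -> prototype feature vector
        self.threshold = threshold          # assumed similarity cut-off

    def add_class(self, label, prototype):
        # Register a novel character from side information; no re-training.
        self.prototypes[label] = prototype

    def predict(self, feature):
        best_label, best_score = None, -1.0
        for label, proto in self.prototypes.items():
            score = cosine(feature, proto)
            if score > best_score:
                best_label, best_score = label, score
        # Reject instead of silently mapping to the nearest seen character.
        if best_score < self.threshold:
            return "<unknown>"
        return best_label


clf = OpenSetCharClassifier({"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0]})
seen = clf.predict([0.9, 0.1, 0.0])          # close to the prototype of "a"
novel_before = clf.predict([0.0, 0.1, 0.9])  # far from every known prototype
clf.add_class("γ", [0.0, 0.0, 1.0])          # new label set entry
novel_after = clf.predict([0.0, 0.1, 0.9])
print(seen, novel_before, novel_after)
```

The threshold trades rejection of unseen characters against accuracy on seen ones, so in practice it would have to be tuned on held-out data.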


Similar resources

Score normalisation applied to open-set, text-independent speaker identification

This paper presents an investigation into the relative effectiveness of various score normalisation methods for open-set, text-independent speaker identification. The paper describes the need for score normalisation in this case, and provides a detailed theoretical and experimental analysis of the methods that can be used for this purpose. The experimental investigations are based on the use of ...


Named Entity Recognition in Persian Text using Deep Learning

Named entity recognition is a fundamental task in the field of natural language processing. It is also known as a subtask of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...


Review of Speech-to-Text Recognition Technology for Enhancing Learning

This paper reviewed literature from 1999 to 2014 inclusively on how Speech-to-Text Recognition (STR) technology has been applied to enhance learning. The first aim of this review is to understand how STR technology has been used to support learning over the past fifteen years, and the second is to analyze all research evidence to understand how Speech-to-Text Recognition technology can enhance ...


Learning a Neural-network-based Representation for Open Set Recognition

Open set recognition problems exist in many domains. For example, in security, new malware classes emerge regularly; therefore malware classification systems need to identify instances from unknown classes in addition to discriminating between known classes. In this paper we present a neural network based representation for addressing the open set recognition problem. In this representation insta...


Multi-class Open Set Recognition Using Probability of Inclusion

The perceived success of recent visual recognition approaches has largely been derived from their performance on classification tasks, where all possible classes are known at training time. But what about open set problems, where unknown classes appear at test time? Intuitively, if we could accurately model just the positive data for any known class without overfitting, we could reject the larg...



Journal

Journal title: Journal of Image and Graphics

Year: 2023

ISSN: 1006-8961

DOI: https://doi.org/10.11834/jig.230018